Abstract. Motivated in part by a problem of combinatorial optimization and in
part by analogies with quantum computations, we consider approximations of
orthogonal matrices $U$ by "non-commutative convex combinations" $A$ of
permutation matrices of the type $A = \sum A_\sigma \sigma$, where $\sigma$ are
permutation matrices and $A_\sigma$ are positive semidefinite $n \times n$
matrices summing up to the identity matrix. We prove that for every
$n \times n$ orthogonal matrix $U$ there is a non-commutative convex
combination $A$ of permutation matrices which approximates $U$ entry-wise
within an error of $c n^{-1/2} \ln n$ and in the Frobenius norm within an error
of $c \ln n$. The proof uses a certain procedure of randomized rounding of an
orthogonal matrix to a permutation matrix.



1. Introduction and main results 

Let $O_n$ be the orthogonal group and let $S_n$ be the symmetric group. As is
well known, $S_n$ embeds in $O_n$ by means of permutation matrices: with a
permutation $\sigma$ of $\{1, \ldots, n\}$ we associate the $n \times n$
permutation matrix $\pi(\sigma)$,

$$\pi_{ij}(\sigma) = \begin{cases} 1 & \text{if } \sigma(j) = i, \\ 0 & \text{otherwise.} \end{cases}$$

To simplify notation, we write $\sigma$ instead of $\pi(\sigma)$, thus
identifying a permutation with its permutation matrix and considering $S_n$ as
a subgroup of $O_n$.
In this paper, we are interested in the following general question: 



• How well are orthogonal matrices approximated by permutation matrices? 
A related question is:

• Is there a reasonable way to "round" an orthogonal matrix to a permutation 
matrix, just like real numbers are rounded to integers? 

To answer the second question, we suggest a simple procedure of randomized
rounding, which, given an orthogonal matrix $U$, produces not a single
permutation matrix $\sigma$ but rather a probability distribution on the
symmetric group $S_n$. Using that procedure, we show that asymptotically, as
$n \to +\infty$, any orthogonal matrix $U$ is approximated by a certain
non-commutative convex combination, defined below, of the permutation matrices.

(1.1) Non-commutative convex hull. Let $v_1, \ldots, v_m \in V$ be vectors,
where $V$ is a real vector space. A vector

(1.1.1) $$v = \sum_{i=1}^m \lambda_i v_i \quad \text{where} \quad \sum_{i=1}^m \lambda_i = 1 \quad \text{and} \quad \lambda_i \ge 0 \quad \text{for } i = 1, \ldots, m$$

is called a convex combination of $v_1, \ldots, v_m$. The set of all convex
combinations of vectors from a given set $X \subset V$ is called the convex
hull of $X$ and denoted $\mathrm{conv}(X)$. We introduce the following
extension of the convex hull, which we call the non-commutative convex hull.

Let $V$ be a Hilbert space with the scalar product $\langle \cdot, \cdot \rangle$.
Recall that a self-conjugate linear operator $A$ on $V$ is called positive
semidefinite provided $\langle Av, v \rangle \ge 0$ for all $v \in V$. To
denote that $A$ is positive semidefinite, we write $A \succeq 0$. Let $I$
denote the identity operator on $V$.

We say that $v$ is a non-commutative convex combination of $v_1, \ldots, v_m$ if

(1.1.2) $$v = \sum_{i=1}^m A_i v_i \quad \text{where} \quad \sum_{i=1}^m A_i = I \quad \text{and} \quad A_i \succeq 0 \quad \text{for } i = 1, \ldots, m.$$

The set of all non-commutative convex combinations of vectors from a given set
$X \subset V$ we call the non-commutative convex hull of $X$ and denote
$\mathrm{nconv}(X)$.

A result of M. Naimark [Na43] describes a general way to construct operators
$A_i \succeq 0$ such that $A_1 + \ldots + A_m = I$. Namely, let $T : V \to W$
be an embedding of Hilbert spaces and let $T^* : W \to V$ be the corresponding
projection. Let

$$W = \bigoplus_{i=1}^m L_i$$

be a decomposition of $W$ into a direct sum of pairwise orthogonal subspaces
and let $P_i : W \to L_i$ be the orthogonal projections. We let
$A_i = T^* P_i T$.
A set of non-negative numbers $\lambda_1, \ldots, \lambda_m$ summing up to 1
can be thought of as a probability distribution on the set $\{1, \ldots, m\}$.
Similarly, a set of positive semidefinite operators $A_i$ summing up to the
identity matrix can be thought of as a measurement in a quantum system, see,
for example, [Kr05]. While we can think of a convex combination of vectors as
the expected value of a vector sampled from some set according to some
probability distribution, we can think of a non-commutative convex combination
as the expected measurement of a set of vectors.

It is clear that $\mathrm{nconv}(X)$ is a convex set and that

$$\mathrm{conv}(X) \subset \mathrm{nconv}(X)$$

since we get a regular convex combination (1.1.1) if we choose $A_i$ in
(1.1.2) to be the scalar operator of multiplication by $\lambda_i$.

(1.2) Convex hulls of the symmetric group and of the orthogonal group.

The convex hull of the permutation matrices $\sigma \in S_n$, described by the
Birkhoff-von Neumann Theorem, consists of the $n \times n$ doubly stochastic
matrices $A$, that is, non-negative matrices with all row and column sums equal
to 1, see, for example, Section II.5 of [Ba02].

The convex hull of the orthogonal matrices $U \in O_n$ consists of all the
operators of norm at most 1, that is, of the operators
$A : \mathbb{R}^n \to \mathbb{R}^n$ such that $\|Ax\| \le \|x\|$ for all
$x \in \mathbb{R}^n$, where $\| \cdot \|$ is the Euclidean norm on
$\mathbb{R}^n$, see, for example, [Ha82].

In this paper, we consider the non-commutative convex hull
$\mathrm{nconv}(S_n)$ of the symmetric group and show that asymptotically, as
$n \to +\infty$, it approximates all the orthogonal matrices. To state our main
result, we consider the following two norms on matrices: the $\ell^\infty$ norm

$$\|B\|_\infty = \max_{i,j} |\beta_{ij}|$$

and the Frobenius or $\ell^2$ norm

$$\|B\|_F = \Bigl( \sum_{i,j} \beta_{ij}^2 \Bigr)^{1/2},$$

where $B = (\beta_{ij})$.

We prove the following result. 

(1.3) Theorem. For every orthogonal $n \times n$ matrix $U$ there exist
positive semidefinite $n \times n$ matrices $A_\sigma \succeq 0$,
$\sigma \in S_n$, such that

$$\sum_{\sigma \in S_n} A_\sigma = I,$$

where $I$ is the $n \times n$ identity matrix, and such that for the
non-commutative convex combination

$$A = \sum_{\sigma \in S_n} A_\sigma \sigma$$

we have

$$\|U - A\|_\infty \le c \frac{\ln n}{\sqrt{n}}$$

and

$$\|U - A\|_F \le c \ln n,$$

where $c$ is an absolute constant.

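(1.4) Discussion. We consider $A_\sigma \sigma$ as the usual product of
$n \times n$ matrices. Thus the matrix $A_\sigma$ acts as a linear operator

$$X \mapsto A_\sigma X$$

on the space $\mathrm{Mat}_n$ of $n \times n$ matrices $X$. Identifying

$$\mathrm{Mat}_n = \underbrace{\mathbb{R}^n \oplus \cdots \oplus \mathbb{R}^n}_{n \text{ times}}$$

by slicing a matrix into its columns, we identify the action of $A_\sigma$ with
the block-diagonal operator

$$\begin{pmatrix} A_\sigma & & & 0 \\ & A_\sigma & & \\ & & \ddots & \\ 0 & & & A_\sigma \end{pmatrix}$$

on $\mathbb{R}^n \oplus \cdots \oplus \mathbb{R}^n$.

Hence the combination $\sum_\sigma A_\sigma \sigma$ indeed fits the definition
of Section 1.1 of a non-commutative convex combination.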

Let $v = (1, \ldots, 1)$ interpreted as a column vector. Then, for any
$A = \sum_\sigma A_\sigma \sigma$ where $\sum_\sigma A_\sigma = I$, we have
$Av = v$. In particular, if $Uv \ne v$, the matrix $U$ cannot be exactly equal
to $A$, so the asymptotic character of Theorem 1.3 is unavoidable. Taking
$U = -I$ we note that one cannot approximate $U$ entry-wise better than within
$1/n$ error, say. If $U$ is a "typical" orthogonal matrix, then we have
$\|U\|_\infty \approx c_1 \sqrt{n^{-1} \ln n}$ for some absolute constant
$c_1$, cf., for example, Chapter 5 of [MS86]. It follows from our proof that
for such a typical $U$ we will have $\|U - A\|_\infty \le c_2 n^{-1} \ln n$ for
some other absolute constant $c_2$.

We also note that $\|U\|_F = \sqrt{n}$ for every $U \in O_n$, so the error in
the Frobenius norm is exponentially small compared to the norm of the matrix.
It is a legitimate question whether the bounds in Theorem 1.3 can be sharpened.
One can ask what kind of matrices one can expect to get via non-commutative
convex combinations

$$A = \sum_{\sigma \in S_n} A_\sigma \sigma$$

of permutation matrices. It is easy to notice that the resulting matrices $A$
can be quite far away from the (usual) convex hull of the orthogonal matrices.
Consider, for example, the following situation: for the identity permutation
$\sigma$, let $A_\sigma$ be the projection onto the first coordinate, for every
transposition $\sigma = (1k)$, $k = 2, \ldots, n$, let $A_\sigma$ be the
projection onto the $k$th coordinate, and for all other $\sigma$, let
$A_\sigma = 0$. Then, for $A = (a_{ij})$ we have $a_{i1} = 1$ for all $i$ and
all other entries of $A$ are 0. Thus the operator norm of $A$ is $\sqrt{n}$.
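
The example above can be checked directly for small $n$. The following is a minimal sketch in plain Python; the helper names (`perm_matrix`, `coordinate_projection`, `build_example`) are illustrative and do not come from the paper, and indices are 0-based, so the transposition $(1k)$ becomes the swap of positions 0 and $k$.

```python
import math

def perm_matrix(sigma):
    """Permutation matrix pi(sigma): entry (i, j) is 1 if sigma[j] == i, else 0."""
    n = len(sigma)
    return [[1 if sigma[j] == i else 0 for j in range(n)] for i in range(n)]

def coordinate_projection(n, k):
    """Orthogonal projection onto the k-th coordinate axis (0-based)."""
    return [[1 if i == j == k else 0 for j in range(n)] for i in range(n)]

def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][t] * B[t][j] for t in range(n)) for j in range(n)]
            for i in range(n)]

def build_example(n):
    """Return (A, W) with A = sum of A_sigma * sigma and W = sum of A_sigma."""
    A = [[0] * n for _ in range(n)]
    W = [[0] * n for _ in range(n)]
    for k in range(n):
        sigma = list(range(n))
        sigma[0], sigma[k] = sigma[k], sigma[0]  # identity for k == 0, else swap of 0 and k
        P = coordinate_projection(n, k)          # the weight A_sigma attached to this sigma
        term = matmul(P, perm_matrix(sigma))
        for i in range(n):
            for j in range(n):
                A[i][j] += term[i][j]
                W[i][j] += P[i][j]
    return A, W

n = 5
A, W = build_example(n)
# All columns of A except the first vanish, so the operator norm is the norm of the first column.
first_column_norm = math.sqrt(sum(A[i][0] ** 2 for i in range(n)))
print(A)
print(first_column_norm, math.sqrt(n))
```

Here `W` is the identity matrix, confirming that the weights sum to $I$, while `A` has ones in its first column and zeros elsewhere.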

(1.5) Rounding an orthogonal matrix to a permutation matrix. The key
construction used in the proof of Theorem 1.3 is that of a randomized rounding
of an orthogonal matrix to a permutation matrix. By now, the idea of randomized
rounding (be it the rounding of a real number to an integer or the rounding of
a positive semidefinite matrix to a vector) has proved itself to be extremely
useful in optimization and other areas, see, for example, [MR95]. Let $U$ be an
$n \times n$ orthogonal matrix and let $x \in \mathbb{R}^n$ be a vector. Let
$y = Ux$, so

$$x = (\xi_1, \ldots, \xi_n) \quad \text{and} \quad y = (\eta_1, \ldots, \eta_n).$$

Suppose that the coordinates $\xi_i$ of $x$ are distinct and that the
coordinates $\eta_i$ of $y$ are distinct. Let
$\phi, \psi : \{1, \ldots, n\} \to \{1, \ldots, n\}$ be the orderings of the
coordinates of $x$ and $y$ respectively:

$$\xi_{\phi(1)} < \xi_{\phi(2)} < \ldots < \xi_{\phi(n)} \quad \text{and} \quad \eta_{\psi(1)} < \eta_{\psi(2)} < \ldots < \eta_{\psi(n)}.$$

We define the rounding of $U$ at $x$ as the permutation $\sigma = \sigma(U, x)$,
$\sigma \in S_n$, such that

$$\sigma(\phi(k)) = \psi(k) \quad \text{for } k = 1, \ldots, n.$$

In words: $\sigma = \sigma(U, x)$ matches the $k$th smallest coordinate of $x$
with the $k$th smallest coordinate of $y = Ux$ for $k = 1, \ldots, n$.
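
The rounding just defined can be sketched in a few lines of plain Python. This is an illustration under stated assumptions rather than code from the paper: matrices are lists of rows, indices are 0-based, a permutation is stored as a list with `sigma[j] = i` meaning $\sigma(j) = i$, and the function names are hypothetical.

```python
import math

def apply(U, x):
    """Matrix-vector product Ux for a square matrix U given as a list of rows."""
    return [sum(u * t for u, t in zip(row, x)) for row in U]

def rounding(U, x):
    """Rounding sigma(U, x): match the k-th smallest coordinate of x with that of Ux."""
    y = apply(U, x)
    n = len(x)
    phi = sorted(range(n), key=lambda j: x[j])  # phi[k]: position of the k-th smallest coordinate of x
    psi = sorted(range(n), key=lambda i: y[i])  # psi[k]: position of the k-th smallest coordinate of y
    sigma = [0] * n
    for k in range(n):
        sigma[phi[k]] = psi[k]
    return sigma

def permute(sigma, x):
    """Action of the permutation matrix of sigma on x: coordinate j of x moves to position sigma[j]."""
    out = [0.0] * len(x)
    for j, i in enumerate(sigma):
        out[i] = x[j]
    return out

# Example: a rotation of the plane by 30 degrees, rounded at x = (1, 2).
theta = math.pi / 6
U = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta), math.cos(theta)]]
x = [1.0, 2.0]
sigma = rounding(U, x)
print(sigma, apply(U, x), permute(sigma, x))
```

Ties between coordinates are not handled specially, since they occur with probability 0 for the random $x$ considered below.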

Let $\mu_n$ be the standard Gaussian measure on $\mathbb{R}^n$ with the density

$$(2\pi)^{-n/2} e^{-\|x\|^2/2} \quad \text{where} \quad \|x\|^2 = \xi_1^2 + \ldots + \xi_n^2 \quad \text{for } x = (\xi_1, \ldots, \xi_n).$$

If we sample $x \in \mathbb{R}^n$ at random with respect to $\mu_n$ then with
probability 1 the coordinates of $x$ are distinct and the coordinates of
$y = Ux$ are distinct. Thus the rounding $\sigma(U, x)$ is defined with
probability 1. Fixing $U$ and choosing $x$ at random, we obtain a certain
probability distribution on the symmetric group $S_n$.

The crucial observation is that for a typical $x$, the vector $y = Ux$ is very
close to the vector $\sigma x$ for $\sigma = \sigma(U, x)$. In other words, the
action of a given orthogonal matrix on a random vector $x$ with high
probability is very close to a permutation of the coordinates. However, the
permutation varies as $x$ varies.

We prove the following result. 
(1.6) Theorem. Let $U$ be an $n \times n$ orthogonal matrix. For
$x \in \mathbb{R}^n$, let $\sigma(U, x) \in S_n$ be the rounding of $U$ at $x$.
Let $z(x) = Ux - \sigma(U, x)x$ and let $\zeta_i(x)$ be the $i$th coordinate of
$z(x)$. Then

$$\int_{\mathbb{R}^n} \zeta_i^2(x) \, d\mu_n(x) \le c \frac{\ln^2 n}{n}$$

for some absolute constant $c$ and $i = 1, \ldots, n$.

(1.7) Discussion. It follows from Theorem 1.6 that

$$\int_{\mathbb{R}^n} \|z(x)\|^2 \, d\mu_n(x) \le c \ln^2 n.$$

Thus, for a typical $x \in \mathbb{R}^n$, we should have

$$\|Ux - \sigma(U, x)x\| = O(\ln n).$$

This should be contrasted with the fact that for a typical
$x \in \mathbb{R}^n$ we have

$$\|x\| \approx n^{1/2}.$$

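Indeed, for any $0 < \epsilon < 1$, we have

$$\mu_n \Bigl\{ x \in \mathbb{R}^n : \|x\|^2 \ge \frac{n}{1 - \epsilon} \Bigr\} \le \exp \Bigl\{ -\frac{\epsilon^2 n}{4} \Bigr\} \quad \text{and}$$

$$\mu_n \bigl\{ x \in \mathbb{R}^n : \|x\|^2 \le (1 - \epsilon) n \bigr\} \le \exp \Bigl\{ -\frac{\epsilon^2 n}{4} \Bigr\},$$

see, for example, Section V.5 of [Ba02].

Thus, on a typical $x$, the action of the operator $U$ and the permutation
$\sigma(U, x)$ do not differ much.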
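
The contrast between $\|Ux - \sigma(U, x)x\|$ and $\|x\|$ can be observed numerically. The sketch below is an illustration only: the dimension, the seed, and the Gram-Schmidt construction of a random orthogonal matrix are assumptions made for this example, and the function names are hypothetical. By the rearrangement inequality, matching sorted orders minimizes $\|Ux - \tau x\|$ over all permutations $\tau$, so the rounding residual can never exceed $\|Ux - x\|$.

```python
import math
import random

def random_orthogonal(n, rng):
    """Orthonormal rows obtained by Gram-Schmidt from independent Gaussian vectors."""
    rows = []
    for _ in range(n):
        v = [rng.gauss(0.0, 1.0) for _ in range(n)]
        for q in rows:
            c = sum(a * b for a, b in zip(q, v))
            v = [a - c * b for a, b in zip(v, q)]
        norm = math.sqrt(sum(a * a for a in v))
        rows.append([a / norm for a in v])
    return rows

def dist(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((s - t) ** 2 for s, t in zip(a, b)))

def rounding_residual(U, x):
    """Return (||Ux - sigma(U,x)x||, ||Ux - x||, ||x||)."""
    y = [sum(u * t for u, t in zip(row, x)) for row in U]
    # sigma(U, x)x puts the k-th smallest coordinate of x where y has its k-th smallest one
    order_y = sorted(range(len(y)), key=lambda i: y[i])
    sorted_x = sorted(x)
    sx = [0.0] * len(x)
    for k, i in enumerate(order_y):
        sx[i] = sorted_x[k]
    return dist(y, sx), dist(y, x), math.sqrt(sum(t * t for t in x))

rng = random.Random(0)
n = 60
U = random_orthogonal(n, rng)
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
residual, baseline, norm_x = rounding_residual(U, x)
print(residual, baseline, norm_x)
```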



The paper is structured as follows. 

In Section 2, we discuss some general properties of the proposed randomized 
rounding and its possible application in the Quadratic Assignment Problem, a hard 
problem of combinatorial optimization. 

In Section 3, we establish concentration inequalities for the order statistics of 
the Gaussian distribution on which the proof of Theorem 1.6 is based. 

In Section 4, we prove Theorem 1.6. 

In Section 5, we deduce Theorem 1.3 from Theorem 1.6. 

In Section 6, we conclude with some general remarks. 
2. Randomized rounding



The procedure described in Section 1.5 satisfies some straightforward
properties that one expects a rounding procedure to satisfy. Given a matrix
$U \in O_n$, the rounding $\sigma(U, x) \in S_n$ for $x \in \mathbb{R}^n$ is
well-defined with probability 1. Thus as $x$ ranges over $\mathbb{R}^n$, with
every orthogonal matrix $U$ we associate a probability distribution $p_U$ on
$S_n$:

$$p_U(\sigma) = \mu_n \bigl\{ x \in \mathbb{R}^n : \sigma(U, x) = \sigma \bigr\}.$$

In other words, $p_U(\sigma)$ tells us how often we get a particular
permutation $\sigma \in S_n$ as a rounding of $U$. For example, if $U = -I$
then $p_U$ is uniform on the permutations $\sigma$ that are the products of
$\lfloor n/2 \rfloor$ commuting transpositions: $\sigma(-I, x)$ is the
permutation matching the $k$th smallest coordinate of $x$ to its
$(n - k + 1)$th smallest coordinate.

We note that if $U$ is a permutation matrix itself, then $\sigma(U, x) = U$
with probability 1, so permutation matrices are rounded to themselves. By
continuity, if $U$ is close to a permutation matrix, one can expect that the
distribution $p_U$ concentrates around that permutation matrix. One can also
show that if $U$ is "local", that is, acts on some set $J$ of $k \ll n$
coordinates of $x$, then $\sigma(U, x)$ is also "local" with high probability,
that is, acts on some $s \ll n$ coordinates containing $J$.

If $\rho \in S_n$ is a permutation then $\sigma(\rho U, x) = \rho \sigma(U, x)$.
Therefore, if we fix $x \in \mathbb{R}^n$ with distinct coordinates and sample
$U$ at random from the Haar probability measure on $O_n$, we get a probability
distribution on $S_n$ which is invariant under the left multiplication by $S_n$
and hence is the uniform distribution. Thus, for any fixed
$x \in \mathbb{R}^n$ with distinct coordinates, the rounding of a random matrix
$U \in O_n$ is a random permutation $\sigma \in S_n$. Geometrically, every such
$x$ produces a partition of $O_n$ into $n!$ isometric regions, each consisting
of the matrices rounded at $x$ to a given permutation $\sigma \in S_n$.

We also note that $\sigma(U, x) = \sigma(U, -x)$.
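
The properties listed in this section can be verified on a small instance. The following sketch is illustrative only: the Householder reflection is used merely as a convenient orthogonal matrix, the seed and dimension are arbitrary, and all function names are hypothetical. Permutations are 0-based lists with `sigma[j] = i` meaning $\sigma(j) = i$.

```python
import random

def apply(U, x):
    """Matrix-vector product Ux for a square matrix U given as a list of rows."""
    return [sum(u * t for u, t in zip(row, x)) for row in U]

def rounding(U, x):
    """Rounding sigma(U, x): match the k-th smallest coordinate of x with that of Ux."""
    y = apply(U, x)
    n = len(x)
    phi = sorted(range(n), key=lambda j: x[j])
    psi = sorted(range(n), key=lambda i: y[i])
    sigma = [0] * n
    for k in range(n):
        sigma[phi[k]] = psi[k]
    return sigma

def householder(v):
    """Orthogonal reflection I - 2 v v^T / <v, v>."""
    s = sum(t * t for t in v)
    n = len(v)
    return [[(1.0 if i == j else 0.0) - 2.0 * v[i] * v[j] / s for j in range(n)]
            for i in range(n)]

def left_multiply(rho, U):
    """Matrix rho * U: row rho[i] of the product is row i of U."""
    out = [None] * len(U)
    for i, row in enumerate(U):
        out[rho[i]] = row
    return out

rng = random.Random(1)
n = 6
U = householder([rng.gauss(0.0, 1.0) for _ in range(n)])
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
rho = list(range(n))
rng.shuffle(rho)

sigma = rounding(U, x)
# sigma(rho U, x) = rho sigma(U, x)
covariant = rounding(left_multiply(rho, U), x) == [rho[s] for s in sigma]
# sigma(U, x) = sigma(U, -x)
even = rounding(U, [-t for t in x]) == sigma
# sigma(-I, x) pairs up coordinates, so it is an involution
minus_identity = [[-1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
tau = rounding(minus_identity, x)
involution = all(tau[tau[j]] == j for j in range(n))
print(covariant, even, involution)
```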

(2.1) Rounding in the Quadratic Assignment Problem. Let us define the scalar
product on the space $\mathrm{Mat}_n$ of real $n \times n$ matrices by

$$\langle A, B \rangle = \sum_{i,j} a_{ij} b_{ij} \quad \text{for } A = (a_{ij}) \text{ and } B = (b_{ij}).$$

Given two $n \times n$ matrices $A$ and $B$, let us consider the function
$f : S_n \to \mathbb{R}$ defined by

$$f(\sigma) = \langle A, \sigma B \sigma^{-1} \rangle$$

(recall that we identify $\sigma$ with its permutation matrix). The problem of
minimizing $f$ over $S_n$, known as the Quadratic Assignment Problem, is one of
the hardest combinatorial optimization problems, see [Qe98]. It has long been
known that if one of the matrices is symmetric (in which case the other can be
replaced by its symmetric part, so we may assume that both $A$ and $B$ are
symmetric), then an easily computable "eigenvalue bound" is available. Namely,
let

$$\lambda_1 \ge \lambda_2 \ge \ldots \ge \lambda_n$$

be the eigenvalues of $A$ and let

$$\mu_1 \ge \mu_2 \ge \ldots \ge \mu_n$$

be the eigenvalues of $B$. Then the minimum value of $f$ is at least

(2.1.1) $$\sum_{i=1}^n \lambda_i \mu_{n-i+1}.$$

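The bound (2.1.1) comes from extending the function $f : S_n \to \mathbb{R}$ to
the function $\tilde{f} : O_n \to \mathbb{R}$ defined by

$$\tilde{f}(U) = \langle A, U B U^* \rangle.$$

It is then easy to compute the minimum of $\tilde{f}$ on $O_n$.

First, we compute $U_1$ such that
$U_1 B U_1^* = \mathrm{diag}(\mu_1, \ldots, \mu_n)$ is the diagonal matrix.
Next, we notice that

$$\tilde{f}(U) = \langle A, U B U^* \rangle = \langle A, (U U_1^*) U_1 B U_1^* (U_1 U^*) \rangle = \langle U_1 U^* A U U_1^*, U_1 B U_1^* \rangle.$$

It is then easy to see that the minimum of $\tilde{f}(U)$ is achieved when
$U_1 U^* = U_2$ such that $U_2 A U_2^* = \mathrm{diag}(\lambda_n, \ldots, \lambda_1)$.
Then we compute $U = U_2^* U_1$.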

The eigenvalue bound (2.1.1) may be far off the minimum of $f$ on $S_n$, in
which case one would expect the optimal matrix $U \in O_n$ to be far away from
a single permutation matrix. Suppose, for example, that $n = 2m$ is even. Let
$J$ be the $m \times m$ matrix of all 1's and let

$$A = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} \otimes J \quad \text{and} \quad B = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \otimes J.$$

Then $f(\sigma) = 0$ on $S_n$ while the values of $\tilde{f}$ on $O_n$ range
from $-n^2/2$ to $n^2/2$.
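
For a small case the example can be verified by brute force. The sketch below is illustrative and assumes the matrices of the example are $A = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} \otimes J$ and $B = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \otimes J$; the function names are hypothetical, and enumeration of all permutations is feasible only for very small $n$.

```python
from itertools import permutations

def qap_objective(A, B, sigma):
    """f(sigma) = <A, sigma B sigma^{-1}> = sum over k, l of A[sigma[k]][sigma[l]] * B[k][l]."""
    n = len(sigma)
    return sum(A[sigma[k]][sigma[l]] * B[k][l] for k in range(n) for l in range(n))

def kron(S, J):
    """Kronecker product of two square matrices given as lists of rows."""
    p, q = len(S), len(J)
    return [[S[i // q][j // q] * J[i % q][j % q] for j in range(p * q)]
            for i in range(p * q)]

m = 2
J = [[1] * m for _ in range(m)]
A = kron([[1, 1], [1, 1]], J)   # the all-ones matrix of size n = 2m
B = kron([[1, 0], [0, -1]], J)  # block diagonal with blocks J and -J
values = {qap_objective(A, B, s) for s in permutations(range(2 * m))}
print(values)  # {0}
```

Since $A$ is the all-ones matrix, $f(\sigma)$ is the sum of the entries of $B$, which is 0 for every $\sigma$, whereas the text gives $-n^2/2$ for the minimum over $O_n$.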

However, if U is close to a particular permutation matrix, that matrix may be 
recovered by rounding. 

3. Concentration for order statistics 

Let $\xi_1, \ldots, \xi_n$ be independent identically distributed real valued
random variables. We define their order statistics as the random variables
$\omega_1, \ldots, \omega_n$, $\omega_k = \omega_k(\xi_1, \ldots, \xi_n)$, such
that

$$\omega_k(\xi_1, \ldots, \xi_n) = \text{the } k\text{th smallest among } \xi_1, \ldots, \xi_n.$$

Thus $\omega_1$ is the smallest among $\xi_1, \ldots, \xi_n$ and $\omega_n$ is
the largest among $\xi_1, \ldots, \xi_n$. We have

$$\omega_1 \le \omega_2 \le \ldots \le \omega_n.$$

We need some concentration inequalities for order statistics.
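(3.1) Lemma. Suppose that the cumulative distribution function $F$ of $\xi_i$
is continuous and strictly increasing. Let $k$ be an integer, $1 \le k \le n$.

(1) Let $\alpha$ be a number such that $F(\alpha) < k/n < 2F(\alpha)$. Then

$$\mathbf{P}\{\omega_k < \alpha\} \le \exp \Bigl\{ -\frac{n}{3F(\alpha)} \Bigl( \frac{k}{n} - F(\alpha) \Bigr)^2 \Bigr\}.$$

(2) Let $\alpha$ be a number such that $F(\alpha) > k/n$. Then

$$\mathbf{P}\{\omega_k > \alpha\} \le \exp \Bigl\{ -\frac{n}{2F(\alpha)} \Bigl( \frac{k}{n} - F(\alpha) \Bigr)^2 \Bigr\}.$$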
Proof. Let us define random variables $X_1, \ldots, X_n$ by

$$X_i = \begin{cases} 1 & \text{if } \xi_i < \alpha, \\ 0 & \text{otherwise} \end{cases}$$

and let $X = X_1 + \cdots + X_n$.

Thus $X_i$ are independent random variables and

$$\mathbf{P}\{X_i = 1\} = F(\alpha) = p.$$

We note that $\omega_k < \alpha$ if and only if $X \ge k$. By Chernoff's
inequality (see, for example, [Mc89] or [Bo91]) we get for $0 < \epsilon < 1$

$$\mathbf{P}\{X \ge pn(1 + \epsilon)\} \le \exp \Bigl\{ -\frac{\epsilon^2 pn}{3} \Bigr\}.$$

Choosing

$$\epsilon = \frac{k}{pn} - 1 = \frac{k}{F(\alpha)n} - 1,$$

we complete the proof of Part (1).

Similarly in Part (2), we have $\omega_k > \alpha$ if and only if
$X \le k - 1$. By Chernoff's inequality we get for $0 < \epsilon < 1$

$$\mathbf{P}\{X \le pn(1 - \epsilon)\} \le \exp \Bigl\{ -\frac{\epsilon^2 pn}{2} \Bigr\}.$$

Choosing

$$\epsilon = 1 - \frac{k}{pn} = 1 - \frac{k}{F(\alpha)n},$$

we complete the proof of Part (2). □
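(3.2) Corollary.

(1) Let $k$ be an integer, $1 \le k \le n$. For $0 < \epsilon < 1/2$, let us
define the number $\alpha^- = \alpha^-(k, \epsilon)$ from the equation

$$F(\alpha^-) = \frac{(1 - \epsilon)k}{n}.$$

Then

$$\mathbf{P}\{\omega_k < \alpha^-\} \le \exp \Bigl\{ -\frac{\epsilon^2 k}{3} \Bigr\}.$$

(2) Let $1 \le k \le n/2$ be an integer. For $0 < \epsilon < 1$, let us define
the number $\alpha^+ = \alpha^+(k, \epsilon)$ from the equation

$$F(\alpha^+) = \frac{(1 + \epsilon)k}{n}.$$

Then

$$\mathbf{P}\{\omega_k > \alpha^+\} \le \exp \Bigl\{ -\frac{\epsilon^2 k}{4} \Bigr\}.$$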

Next, we consider the case of the identically distributed standard Gaussian
random variables with the density

$$\phi(t) = \frac{1}{\sqrt{2\pi}} e^{-t^2/2}$$

and the cumulative distribution function

$$F(t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{t} e^{-\tau^2/2} \, d\tau.$$

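(3.3) Lemma. Let $\xi_1, \ldots, \xi_n$ be independent standard Gaussian random
variables. Let $1 \le k \le n/2$ be an integer. Let $0 < \epsilon < 1/2$ be a
number and let us define numbers $\alpha^+ = \alpha^+(k, \epsilon)$ and
$\alpha^- = \alpha^-(k, \epsilon)$ from the equations

$$F(\alpha^+) = \frac{(1 + \epsilon)k}{n} \quad \text{and} \quad F(\alpha^-) = \frac{(1 - \epsilon)k}{n}.$$

Then

(1)

$$\mathbf{P}\{\omega_k < \alpha^-\} \le \exp \Bigl\{ -\frac{\epsilon^2 k}{3} \Bigr\} \quad \text{and} \quad \mathbf{P}\{\omega_k > \alpha^+\} \le \exp \Bigl\{ -\frac{\epsilon^2 k}{4} \Bigr\};$$

(2)

$$0 < \alpha^+ - \alpha^- \le \frac{\epsilon \sqrt{8\pi}}{1 - \epsilon}.$$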
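Proof. Part (1) is immediate from Corollary 3.2. Clearly,
$\alpha^+ - \alpha^- > 0$. Applying Rolle's Theorem we get

$$\frac{2k\epsilon}{n} = F(\alpha^+) - F(\alpha^-) = (\alpha^+ - \alpha^-) \phi(t^*) \quad \text{for some } \alpha^- < t^* < \alpha^+.$$

Using the inequality

$$F(a) \le e^{-a^2/2} = \sqrt{2\pi} \, \phi(a) \quad \text{for } a \le 0$$

(cf. also formula (4.2) below), we get

$$\phi(t) \ge \frac{F(t)}{\sqrt{2\pi}} \ge \frac{(1 - \epsilon)k}{n\sqrt{2\pi}} \quad \text{for } \alpha^- \le t \le 0.$$

By symmetry,

$$\phi(t) \ge \frac{(1 - \epsilon)k}{n\sqrt{2\pi}} \quad \text{for } 0 \le t \le \alpha^+.$$

Summarizing,

$$\alpha^+ - \alpha^- = \frac{2k\epsilon}{n\phi(t^*)} \le \frac{\epsilon\sqrt{8\pi}}{1 - \epsilon}$$

and the proof of Part (2) follows. □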

4. Proof of Theorem 1.6 

We need a technical (non-optimal) estimate. 

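(4.1) Lemma. Let $f : \mathbb{R}^n \to \mathbb{R}$ be a function such that
$f(\lambda x) = \lambda f(x)$ for all $x \in \mathbb{R}^n$ and all
$\lambda \ge 0$. Let

$$B = \bigl\{ x \in \mathbb{R}^n : \|x\| \le n \bigr\}$$

be a ball of radius $n$ and let $\mu_n$ be the standard Gaussian measure on
$\mathbb{R}^n$. Then there exists a constant $c$ such that

$$\int_{\mathbb{R}^n} f^2 \, d\mu_n \le c \int_B f^2 \, d\mu_n$$

for all $n$.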

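Proof. Let $S \subset \mathbb{R}^n$ be the unit sphere. Passing to the polar
coordinates, we get

$$\int_{\mathbb{R}^n} f^2 \, d\mu_n = (2\pi)^{-n/2} \Bigl( \int_S f^2 \, dx \Bigr) \int_0^{+\infty} t^{n+1} e^{-t^2/2} \, dt$$

and, similarly,

$$\int_B f^2 \, d\mu_n = (2\pi)^{-n/2} \Bigl( \int_S f^2 \, dx \Bigr) \int_0^{n} t^{n+1} e^{-t^2/2} \, dt.$$

Furthermore, we have

$$\int_0^{+\infty} t^{n+1} e^{-t^2/2} \, dt = 2^{n/2} \, \Gamma \Bigl( \frac{n + 2}{2} \Bigr).$$

For all sufficiently large $n$, we have

$$t^{n+1} e^{-t^2/2} \le e^{-t^2/4} \quad \text{for all } t \ge n,$$

so we have

$$\int_n^{+\infty} t^{n+1} e^{-t^2/2} \, dt \le c$$

for some constant $c$ and all $n$. The proof now follows. □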
Apart from Lemma 4.1, we need the estimate:

(4.2) $$\mu_n \bigl\{ x = (\xi_1, \ldots, \xi_n) : |\xi_i| \ge t \bigr\} \le 2e^{-t^2/2}$$

for any $t \ge 0$ and any $i = 1, \ldots, n$, see, for example, Section V.5 of
[Ba02]. Now we can prove Theorem 1.6.

Proof of Theorem 1.6. Let $B$ be the ball of radius $n$ in $\mathbb{R}^n$
centered at the origin. By Lemma 4.1 it suffices to prove the estimate for the
integral

$$\int_B \zeta_i^2(x) \, d\mu_n(x).$$

Without loss of generality, we may assume that $i = 1$ so that
$\zeta(x) = \zeta_1(x)$ is the first coordinate of $Ux - \sigma(U, x)x$.

Let $V_k \subset B$ be the subset of $x \in B$ such that the first coordinate
of $Ux$ is the $k$th smallest among the coordinates of $Ux$. Then
$V_1, \ldots, V_n$ are polyhedral (generally, non-convex) sets that cover $B$
and intersect only at boundary points. Since $B$ is $O_n$-invariant, the sets
$V_k$ are isometric and so we have

$$\mu_n(V_1) = \cdots = \mu_n(V_n) = \frac{\mu_n(B)}{n} \le \frac{1}{n}.$$

Thus we have

$$\int_B \zeta^2(x) \, d\mu_n(x) = \sum_{k=1}^n \int_{V_k} \zeta^2(x) \, d\mu_n(x).$$

In what follows, $c_i$ for $i = 1, 2, \ldots$ denote various absolute constants.
We note that for any $x \in B$ we have $|\zeta(x)| \le 2n$. Moreover, by (4.2)

IJ,n^x e 14 : |C(^)I ^ ciVlnnj < for all sufficiently large n. 

Therefore,

(4.3) $$\int_{V_k} \zeta^2(x) \, d\mu_n(x) \le c_2 (\ln n) n^{-1} \quad \text{for all } k.$$

For

$$36 \ln n < k \le n/2$$

and all sufficiently large $n$ we get a better estimate via Lemma 3.3. Namely,
let us choose $\epsilon = \epsilon_k = 3k^{-1/2} \sqrt{\ln n}$ in Lemma 3.3 and
let $\alpha^+$ and $\alpha^-$ be the corresponding bounds. It follows that for
$36 \ln n < k \le n/2$ and all sufficiently large $n$ we have



and, similarly, 

where

$$0 < \alpha^+ - \alpha^- \le c_3 k^{-1/2} \sqrt{\ln n}.$$

Hence



Hn^x e R" : \uJk{Ux) - Ukix)\ > csfc" Wlnnj < 2n~^. 
Since for $x \in V_k$ we have $\zeta(x) = \omega_k(Ux) - \omega_k(x)$, we conclude
fXn^xeVk-. \C{x)\ > cs/c" Wlnnj < 2n~^ 

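and

(4.4) $$\int_{V_k} \zeta^2(x) \, d\mu_n(x) \le c_4 n^{-1} k^{-1} \ln n \quad \text{for } 36 \ln n < k \le n/2$$

and all sufficiently large $n$.

Summarizing (4.3) and (4.4), we get

$$\sum_{1 \le k \le n/2} \int_{V_k} \zeta^2(x) \, d\mu_n(x) = \sum_{1 \le k \le 36 \ln n} \int_{V_k} \zeta^2(x) \, d\mu_n(x) + \sum_{36 \ln n < k \le n/2} \int_{V_k} \zeta^2(x) \, d\mu_n(x) \le c_5 (\ln n)^2 n^{-1}.$$

Since by the symmetry $x \leftrightarrow -x$ we have

$$\int_{V_k} \zeta^2(x) \, d\mu_n(x) = \int_{V_{n-k+1}} \zeta^2(x) \, d\mu_n(x),$$

the proof follows. □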
5. Proof of Theorem 1.3

First, we introduce some notation.

For vectors $x = (\xi_1, \ldots, \xi_n)$ and $y = (\eta_1, \ldots, \eta_n)$ let
$x \otimes y$ be the $n \times n$ matrix with the $(i, j)$th entry equal to
$\xi_i \eta_j$.

We observe that for any $n \times n$ matrix $A$ we have

$$A(x \otimes y) = (Ax) \otimes y,$$

where the product on the left hand side we interpret as the product of matrices
and the product $Ax$ on the right hand side we interpret as a product of a
matrix and a column vector.

Let

$$\langle x, y \rangle = \sum_{i=1}^n \xi_i \eta_i \quad \text{for } x = (\xi_1, \ldots, \xi_n) \text{ and } y = (\eta_1, \ldots, \eta_n)$$

be the standard scalar product in $\mathbb{R}^n$. Then for all
$x, y, a \in \mathbb{R}^n$, we have

(5.1) $$(x \otimes y) a = \langle a, y \rangle x.$$

Let

$$\|x\| = \sqrt{\langle x, x \rangle} \quad \text{for } x \in \mathbb{R}^n$$

be the usual Euclidean norm of a vector.
We need a couple of technical results.

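(5.2) Lemma. Let $L = (l_{ij})$ be an $n \times n$ matrix. Then

$$\|L\|_F^2 = \int_{\mathbb{R}^n} \|La\|^2 \, d\mu_n(a).$$

Proof. Let $a = (\alpha_1, \ldots, \alpha_n)$, where $\alpha_j$ are independent
standard Gaussian random variables. Then

$$\|La\|^2 = \sum_{i=1}^n \Bigl( \sum_{j=1}^n l_{ij} \alpha_j \Bigr)^2.$$

Since $\mathbf{E}\,\alpha_i \alpha_j = 0$ for $i \ne j$ and
$\mathbf{E}\,\alpha_j^2 = 1$, taking the expectation we get

$$\mathbf{E}\,\|La\|^2 = \sum_{i,j=1}^n l_{ij}^2 = \|L\|_F^2.$$

□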
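(5.3) Lemma. Let $f : \mathbb{R}^n \to \mathbb{R}$ be an integrable function
such that

$$\int_{\mathbb{R}^n} f^2(x) \, d\mu_n(x) < +\infty \quad \text{and} \quad \int_{\mathbb{R}^n} \|x\|^2 f^2(x) \, d\mu_n(x) < +\infty.$$

Then

$$\int_{\mathbb{R}^n} \Bigl( \int_{\mathbb{R}^n} \langle a, x \rangle f(x) \, d\mu_n(x) \Bigr)^2 d\mu_n(a) \le \int_{\mathbb{R}^n} f^2(x) \, d\mu_n(x).$$

Proof. Let $\mathcal{L}$ be the subspace of the Hilbert space
$L^2(\mathbb{R}^n, \mu_n)$ consisting of the linear functions and let
$\mathcal{L}^\perp$ be its orthogonal complement. We write

$$f = \langle b, x \rangle + h,$$

where $b \in \mathbb{R}^n$ and $h \in \mathcal{L}^\perp$.
Hence we have

$$\int_{\mathbb{R}^n} f^2(x) \, d\mu_n(x) \ge \int_{\mathbb{R}^n} \langle b, x \rangle^2 \, d\mu_n(x) = \langle b, b \rangle$$

and

$$\int_{\mathbb{R}^n} \langle a, x \rangle f(x) \, d\mu_n(x) = \int_{\mathbb{R}^n} \langle a, x \rangle \langle b, x \rangle \, d\mu_n(x) = \langle a, b \rangle.$$

Therefore,

$$\int_{\mathbb{R}^n} \Bigl( \int_{\mathbb{R}^n} \langle a, x \rangle f(x) \, d\mu_n(x) \Bigr)^2 d\mu_n(a) = \int_{\mathbb{R}^n} \langle a, b \rangle^2 \, d\mu_n(a) = \langle b, b \rangle \le \int_{\mathbb{R}^n} f^2(x) \, d\mu_n(x)$$

as claimed. □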

Now we are ready to prove Theorem 1.3.

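Proof of Theorem 1.3. Given an orthogonal matrix $U$, we will construct a
matrix $A$ approximating $U$ as desired in the form

$$A = \sum_{\sigma \in S_n} \sigma A_\sigma, \quad \text{where} \quad \sum_{\sigma \in S_n} A_\sigma = I \quad \text{and} \quad A_\sigma \succeq 0.$$

To get the approximation of the type

$$A = \sum_{\sigma \in S_n} A_\sigma \sigma$$

claimed in the Theorem, one should apply the construction to $U^*$.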
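Let $\sigma(U, x)$ be the rounding of $U$ at $x$ and let us define

$$X_\sigma = \bigl\{ x \in \mathbb{R}^n : \sigma(U, x) = \sigma \bigr\} \quad \text{for } \sigma \in S_n$$

and

$$A_\sigma = \int_{X_\sigma} x \otimes x \, d\mu_n(x).$$

Clearly, $A_\sigma$ are positive semidefinite and

$$\sum_{\sigma \in S_n} A_\sigma = \int_{\mathbb{R}^n} x \otimes x \, d\mu_n(x) = I.$$

On the other hand,

$$U - \sum_{\sigma \in S_n} \sigma A_\sigma = \int_{\mathbb{R}^n} (Ux - \sigma(U, x)x) \otimes x \, d\mu_n(x).$$

Let

$$L = \int_{\mathbb{R}^n} (Ux - \sigma(U, x)x) \otimes x \, d\mu_n(x) = \int_{\mathbb{R}^n} z(x) \otimes x \, d\mu_n(x)$$

in the notation of Theorem 1.6. Thus $L$ is an $n \times n$ matrix,
$L = (l_{ij})$, and

$$U - A = L.$$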
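
The integrals defining $A_\sigma$ can be estimated by sampling. The sketch below is a Monte Carlo illustration of the combination in the form $\sum_\sigma \sigma A_\sigma$ used in this proof, not part of the argument itself: the sample size, the seed, the test matrix, and the function names are all assumptions made for the example. Each sample $x$ contributes $(\sigma(U, x)x) \otimes x / N$ to the estimate of $A$ and $x \otimes x / N$ to the estimate of $\sum_\sigma A_\sigma$.

```python
import math
import random

def rounded_image(U, x):
    """Return sigma(U, x)x: the coordinates of x rearranged into the order pattern of Ux."""
    y = [sum(u * t for u, t in zip(row, x)) for row in U]
    order_y = sorted(range(len(y)), key=lambda i: y[i])
    sorted_x = sorted(x)
    out = [0.0] * len(x)
    for k, i in enumerate(order_y):
        out[i] = sorted_x[k]
    return out

def monte_carlo_combination(U, num_samples, rng):
    """Estimate A = sum of sigma * A_sigma and S = sum of A_sigma from Gaussian samples."""
    n = len(U)
    A = [[0.0] * n for _ in range(n)]
    S = [[0.0] * n for _ in range(n)]
    for _ in range(num_samples):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        sx = rounded_image(U, x)
        for i in range(n):
            for j in range(n):
                A[i][j] += sx[i] * x[j] / num_samples
                S[i][j] += x[i] * x[j] / num_samples
    return A, S

# Example: two planar rotations acting on coordinate pairs (0, 1) and (2, 3).
c, s = math.cos(0.4), math.sin(0.4)
U = [[c, -s, 0.0, 0.0],
     [s, c, 0.0, 0.0],
     [0.0, 0.0, c, s],
     [0.0, 0.0, -s, c]]
A, S = monte_carlo_combination(U, 5000, random.Random(2))
max_error = max(abs(U[i][j] - A[i][j]) for i in range(4) for j in range(4))
print(max_error)
```

For such a small $n$ the entry-wise error need not be small, since the bounds of Theorem 1.3 are asymptotic in $n$.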

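Using Theorem 1.6, we estimate $l_{ij}$. Denoting by $\xi_j(x)$ the $j$th
coordinate of $x$, we get from Theorem 1.6

$$|l_{ij}| = \Bigl| \int_{\mathbb{R}^n} \zeta_i(x) \xi_j(x) \, d\mu_n(x) \Bigr| \le \Bigl( \int_{\mathbb{R}^n} \zeta_i^2(x) \, d\mu_n(x) \Bigr)^{1/2} \Bigl( \int_{\mathbb{R}^n} \xi_j^2(x) \, d\mu_n(x) \Bigr)^{1/2} \le c n^{-1/2} \ln n,$$

from which we get

$$\|U - A\|_\infty \le c n^{-1/2} \ln n$$

as desired.

Finally, we estimate

$$\|U - A\|_F = \|L\|_F$$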
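using Lemma 5.2. By formula (5.1) for $a \in \mathbb{R}^n$ we have

$$La = \int_{\mathbb{R}^n} \langle a, x \rangle (Ux - \sigma(U, x)x) \, d\mu_n(x) = \int_{\mathbb{R}^n} \langle a, x \rangle z(x) \, d\mu_n(x)$$

in the notation of Theorem 1.6. Let us estimate

$$\int_{\mathbb{R}^n} \|La\|^2 \, d\mu_n(a).$$

The $i$th coordinate $\lambda_i(a)$ of $La$ is

$$\lambda_i(a) = \int_{\mathbb{R}^n} \langle a, x \rangle \zeta_i(x) \, d\mu_n(x).$$

By Lemma 5.3,

$$\int_{\mathbb{R}^n} \lambda_i^2(a) \, d\mu_n(a) = \int_{\mathbb{R}^n} \Bigl( \int_{\mathbb{R}^n} \langle a, x \rangle \zeta_i(x) \, d\mu_n(x) \Bigr)^2 d\mu_n(a) \le \int_{\mathbb{R}^n} \zeta_i^2(x) \, d\mu_n(x) \le c \frac{\ln^2 n}{n}$$

by Theorem 1.6. Therefore,

$$\|L\|_F^2 = \int_{\mathbb{R}^n} \|La\|^2 \, d\mu_n(a) = \sum_{i=1}^n \int_{\mathbb{R}^n} \lambda_i^2(a) \, d\mu_n(a) \le c \ln^2 n$$

as desired. □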

6. Concluding remarks 

A somewhat stronger estimate follows from our proof of Theorem 1.3. Namely,
let $u_1, \ldots, u_n$ be the column vectors of $U$ and let
$a_1, \ldots, a_n$ be the column vectors of $A$. Then

$$\|u_i - a_i\| \le c \frac{\ln n}{\sqrt{n}} \quad \text{for } i = 1, \ldots, n,$$

where $\| \cdot \|$ is the Euclidean norm in $\mathbb{R}^n$.

It follows from our construction of matrices $A_\sigma$ in the proof of Theorem
1.3 that the trace of $A_\sigma$ is equal to the probability that the matrix
$U^*$ is rounded to the permutation $\sigma$.

One can easily construct small approximate non-commutative convex combinations

$$A = \frac{1}{N} \sum_{i=1}^N A_i \sigma_i$$

with

$$A_i \succeq 0 \quad \text{and} \quad \frac{1}{N} \sum_{i=1}^N A_i \approx I$$

by sampling $N$ points $x_i$ at random from the Gaussian distribution $\mu_n$,
computing the rounding $\sigma_i = \sigma(U^*, x_i)$ and letting
$A_i = x_i \otimes x_i$.